Adaptable image search with computer vision assistance

ABSTRACT

A computing device having adaptable image search and methods for operating an image recognition program on the computing device are disclosed herein. An image recognition program may receive a query from a user and a target image within which a search based on the query is to be performed using one or more of a plurality of locally stored image recognition models, which are determined to be able to perform the search with sufficiently high confidence. The query may comprise text that is typed or converted from speech. The image recognition program performs the search within the target image for a target region of the target image using at least one selected image recognition model stored locally, and returns a search result to the user.

BACKGROUND

Image searching technologies may enable a user to obtain information about an object in an image or locate a specific object within the image. The same process may be applied to people, scenes, text, etc. Typical image recognition services operate by receiving an image from the user, analyzing the image for distinctive features, and then matching the object in the image against images in a database using algorithms.

As digital camera sensors and memory capacity have improved, the sizes of the images captured by digital cameras have increased. Currently, some camera-equipped smartphones capture images of over 40 megapixels. Uploading an image of this size to a cloud-based service usually takes significant time and bandwidth, especially if done over a cellular network, which often incurs additional cost to the user. Once such a large image is uploaded, an image recognition service may take extra time and computational power to process the image as compared to a smaller image, which slows response time down. Additionally, since the image is sent over a network, issues related to privacy can arise. As a result, significant challenges exist for cloud-based image search services to be applied to large images captured on next generation cameras.

SUMMARY

A computing device having adaptable image search and methods for operating an image recognition program on the computing device are disclosed herein. One disclosed embodiment may include non-volatile memory configured to store a plurality of image recognition models and the image recognition program executed by a processor of the computing device. The image recognition program may receive a query from a user and a target image within which a search based on the query is to be performed. The query may comprise text that is typed or converted from speech.

The image recognition program may then rank the image recognition models by confidence level for performing the search within the target image and determine whether any of the image recognition models is above a confidence threshold for performing the search locally on the processor of the computing device. If it determines that at least one of the image recognition models is above the confidence threshold, the image recognition program may select at least one highly ranked image recognition model. Then, the image recognition program may perform the search within the target image for a target region of the target image using at least one selected image recognition model, and finally, return a search result to the user.

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic view of a computing device performing a local image recognition search.

FIG. 2 is a schematic view of the computing device of FIG. 1 performing a web-based image recognition search.

FIG. 3 is a flowchart of a method for operating an image recognition program on the computing device of FIG. 1 or other suitable hardware.

FIG. 4 is a flowchart that expands upon a step of the flowchart of FIG. 3, and illustrates a method for downloading an image recognition model from a web service.

FIG. 5 is a flowchart that expands upon a step of the flowchart of FIG. 3, and illustrates a method for creating a new image recognition model based on example images from a web image search.

FIG. 6 shows a simplified schematic view of a computing system including the computing device.

FIG. 7 illustrates an example use case scenario of an image recognition search for a red coffee mug.

FIG. 8 illustrates another example use case scenario of an image recognition search for a particular book.

FIG. 9 illustrates another example use case scenario of an image recognition search for an electronics store at a mall using a mall directory.

DETAILED DESCRIPTION

FIGS. 1 and 2 are schematic views of a computing device 10 configured with adaptable image search functionality that is able to perform a local image recognition search that uses different models stored locally to conduct the image search, and also optionally to conduct a web based image recognition search. In one embodiment, the computing device 10 is configured to present the user with an option for local image recognition search, and if such a local image recognition search cannot be performed with high confidence, then instead present an option to the user for conducting a web-based image recognition search, or programmatically perform a web search without requiring the user to select a web search option. It will be appreciated that by affording the user the option to first attempt to conduct image recognition searches locally in this manner, the computing device 10 potentially addresses the challenges discussed above involving the transmission of large size images over a network to a web based image recognition server. In some embodiments, the option for a web based image recognition search is not presented until after the local search has been ruled out as not available, and in other embodiments, the options for both web based and local searching are presented to the user concurrently from the beginning of the image search interaction dialog.

FIG. 1 shows computing device 10 presenting the user with options of performing an image recognition search on the web or locally for a target image 12 displayed on the computing device 10. The user may select the target image 12 from a suitable source, such as a camera output or data store in non-volatile memory 20 on the computing device 10. A plurality of image recognition models 22 may also be stored in the non-volatile memory 20. Each image recognition model 22 may include an image recognition algorithm, an optical character recognition (OCR) algorithm, and/or a keyword matching algorithm, among others. Each image recognition model 22 may contain only one algorithm or any combination of multiple algorithms of the same or differing type.

An image recognition program 24 executed on processor 26 of the computing device 10 may display an image search graphical user interface (GUI) on display 32, which may include a GUI selector labeled LOCAL. Selection of the LOCAL selector by a user may trigger a local image recognition search. Alternatively, the local image recognition search may be selected using another type of command such as a voice or gesture command.

The image recognition program 24 may be configured to receive a query 28 from a user. An input device 30 of computing device 10 may include a microphone, a keyboard, a touchscreen, etc. The query 28 may be, for example, text that is typed on the keyboard or touchscreen, converted from speech captured by the microphone, converted via optical character recognition (OCR) from an image that may be, for instance, captured by the camera 34 or stored in the non-volatile memory 20, or produced by other techniques. Audio, text, etc. may also be stored in the non-volatile memory 20 in advance and then used to form a query 28. Alternatively, the query 28 may be an image or video of a target object the user is interested in finding. Multiple images or frames of video may depict different viewpoints of the same target object. The user may optionally select a bounding box within the query image to help the image recognition program 24 to locate the target object, especially if there are many irrelevant objects in the image.

The image recognition program 24 may also receive a target image 12 within which a search based on the query 28 is to be performed. As described above, the target image 12 is typically preselected by the user, and may originate from an onboard camera, or may be selected from a stored folder of images, etc., and the search is to find the target object, etc., that may be located within the target image 12. The location of the target object, etc., within the target image 12 may be referred to as a target region of the target image 12.

Next, the image recognition program 24 may rank the image recognition models 22 by confidence level for performing the search based on the query 28 within the target image 12, then determine whether any of the image recognition models 22 is above a confidence threshold for performing the search locally on the processor 26 of the computing device 10. Upon determining that at least one of the image recognition models 22 is above the confidence threshold, the image recognition program may select at least one highly ranked image recognition model 22′ and perform the search within the target image 12 for a target region of the target image 12 using at least one selected image recognition model 22′.
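By way of a non-limiting illustration, the ranking and threshold test described above might be sketched as follows. The names ModelScore and select_models, the threshold value of 0.7, and the example confidence values are assumptions made for this sketch only and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class ModelScore:
    name: str          # hypothetical identifier for an image recognition model 22
    confidence: float  # estimated confidence, assumed to lie in [0, 1]


def select_models(scores, threshold=0.7, max_models=1):
    """Rank models by confidence and keep the top ones above the threshold."""
    ranked = sorted(scores, key=lambda s: s.confidence, reverse=True)
    eligible = [s for s in ranked if s.confidence > threshold]
    # An empty list means no local model qualifies for the search.
    return eligible[:max_models]


scores = [
    ModelScore("ocr", 0.35),
    ModelScore("face", 0.92),
    ModelScore("2d_match", 0.81),
]
selected = select_models(scores, threshold=0.7, max_models=2)
```

In this sketch an empty result stands for the case in which no locally stored model exceeds the confidence threshold.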

The confidence level of the image recognition models 22 may be influenced by a number of factors. For instance, the image recognition program 24 may run one or more light weight processes (i.e., less computationally intensive algorithms) to classify objects in the target image 12 and/or the query 28. One example of such a light weight process may be a face detection algorithm to detect whether any faces are present in the target image 12. If the query 28 is resolved to be a person's name, then a light weight process could be run to determine whether any faces are present in the image, and if so, then one or more of the image recognition models 22 containing a more complex facial recognition algorithm may be selected for performing the search of the image for the person matching the name in the query. In another example, if the query 28 is the name of a brand of cereal, then image recognition models 22 including 2-D image matching algorithms configured to detect rectangular shapes of a particular color may be determined to have a higher confidence level.
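As one hedged sketch of how the result of a light weight process could influence a confidence level, the function below raises the confidence of a model containing a facial recognition algorithm when a person's name is queried and faces were detected, and lowers it when no faces were detected. The function name, the "face_recognition" label, and the numeric adjustments (+0.3 and x0.1) are illustrative assumptions, not values taken from the disclosure.

```python
def adjust_confidence(base, algorithms, query_is_person_name, faces_detected):
    """Adjust a model's base confidence using a cheap face-detection result.

    base: starting confidence in [0, 1]
    algorithms: set of algorithm labels contained in the model (hypothetical labels)
    query_is_person_name: True if the query was resolved to a person's name
    faces_detected: True if the light weight face detector found any face
    """
    if query_is_person_name and "face_recognition" in algorithms:
        if faces_detected:
            # Faces are present, so the heavier recognizer is worth running.
            return min(1.0, base + 0.3)
        # No faces found: the recognizer is unlikely to locate the person.
        return base * 0.1
    # The face check says nothing about other kinds of models or queries.
    return base
```

The heavier facial recognition algorithm would run only after this inexpensive check, which is the point of classifying the target image first.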

The image recognition models 22 may also include text descriptions that may be compared to the query 28. Additionally, the image recognition program 24 may show the user the image on which one of the image recognition models 22 was based. For example, if the query 28 is “bass,” then the user may be shown images of a fish and a musical instrument on display 32 and the user may choose the one to which he was referring. In such a manner, the image recognition program 24 may optionally suggest multiple image recognition models 22 with high confidence levels to the user, and the user may select at least one image recognition model 22′ for performing the search. Having a plurality of image recognition models 22 stored on the computing device 10 and ready to be used even in the absence of a network connection may speed up the search. Different image recognition models 22 may work better for different queries 28, and using the right selected image recognition model 22′ for the job may lead to less time and computational power being spent on performing the search.

Finally, the image recognition program 24 may return a search result 18 to the user, which may include displaying it on display 32 and ending the search. The visual displaying of the search result 18 may be accompanied by an audio alert or reporting of the search result 18, or a vibration, for example. At any time prior to receiving the search result 18, the user may indicate to the image recognition program 24 that she wishes to end the search.

The computing device 10 is depicted as a smartphone in this embodiment, but it may be any suitable device, including other mobile devices such as a tablet or laptop computer; a wearable device such as a watch, head-mounted display (HMD) device, or other wearable computing device; or a stationary device such as a desktop computer. The image recognition program 24 may display the search result 18 on the display 32, which may be an organic light emitting diode (OLED) display, liquid-crystal display (LCD), or head-up display (HUD), for example. The computing device 10 may also comprise a camera 34, and the target image 12 may be captured by the camera 34, as in this embodiment. The target image 12 may be a single image or one or a plurality of image frames that constitute a portion of a video.

Secondary signals such as global positioning system (GPS) data and wireless network service set identifiers (SSIDs) associated with known geographic locations, for example, may be associated with the image recognition models 22. The image recognition program 24 may use location information about the user or other users via these associated secondary signals. For instance, the image recognition program 24 may check to see if other users have submitted a similar query and where they were (e.g., as determined by an SSID known to a geolocation service) when they found an object that the user is searching for, and then relay this information to the user. Such an object may be a physical, inanimate object, but it may also be a person, animal, scene, portion of text, etc. Secondary signals may aid in image recognition model 22 confidence ranking and selection by providing context for the search. For example, if the user is determined through secondary signals to be at a museum, image recognition models with 2-D image matching algorithms may be highly ranked by the image recognition program 24 to accurately complete a search for specific paintings.
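The museum example above might be sketched as a context-dependent boost applied before ranking. The CONTEXT_BOOSTS table, the algorithm labels, the model names, and the boost of 0.2 are hypothetical values chosen for illustration; how a place type such as "museum" is derived from GPS data or an SSID is outside the scope of this sketch.

```python
# Hypothetical table: place type -> algorithm label -> confidence bonus.
CONTEXT_BOOSTS = {"museum": {"2d_match": 0.2}}


def rank_with_context(base_scores, algorithms, context):
    """Return model names ordered by confidence after applying a context boost.

    base_scores: dict of model name -> base confidence in [0, 1]
    algorithms: dict of model name -> list of algorithm labels in that model
    context: place type inferred from secondary signals, or None if unknown
    """
    boosts = CONTEXT_BOOSTS.get(context, {})
    adjusted = {}
    for name, score in base_scores.items():
        # Use the largest bonus earned by any algorithm in the model.
        bonus = max((boosts.get(a, 0.0) for a in algorithms[name]), default=0.0)
        adjusted[name] = min(1.0, score + bonus)
    return sorted(adjusted, key=adjusted.get, reverse=True)


base_scores = {"painting_matcher": 0.6, "face_model": 0.7}
algorithms = {"painting_matcher": ["2d_match"], "face_model": ["face_recognition"]}
ranking_at_museum = rank_with_context(base_scores, algorithms, "museum")
ranking_unknown = rank_with_context(base_scores, algorithms, None)
```

With these assumed numbers, the 2-D matching model moves to the top of the ranking only when the context is a museum.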

The other users discussed above whose data may be used to conduct the image search may be anonymous users that have submitted feedback to a web service associated with the image recognition program 24, other users in the user's contacts, or other users in the user's social media networks, for example. In another example, the image recognition program 24 may not begin processing the query 28 until the user's current location as determined through a GPS signal has reached a predetermined location where the image recognition program 24 should look for the object the user desires.

In some instances, the target image 12 may be an image or video of the user's current surroundings and the query 28 may indicate a desired product or destination. For example, the user, Jane, may set up a shopping list at home using a wireless network. In this case, the image recognition program 24 may receive one query 28 per item on the shopping list. Jane may capture images of items that are running low with the camera 34 on her HMD device, and then the image recognition program 24 may use OCR to extract text from the images and form a number of queries 28. Alternatively, Jane may capture one single image of a written shopping list or of all of the items she wants to find at the store and each query 28 may be generated from the same image. The image recognition program 24 may ready the selected image recognition model 22′ for each item as described above.

Once the image recognition program receives an indication that Jane has arrived at the store, whether automatically by receiving the GPS signal or geolocated SSID or by Jane directly commanding the image recognition program 24 to proceed, the HMD device may use its camera 34 to send a constant video feed or intermittently captured images to the image recognition program 24 for processing. In this case, the target region of the target image 12 may contain an image of one of the items on the shopping list. When the search result 18 is returned to Jane (for instance, as the target image 12 displayed on display 32 with the target region circled and an accompanying beep sound), Jane may indicate to the image recognition program 24 that the item just located may be removed from the shopping list. In this manner, Jane may finish her shopping.

In another example, the user, Kazu, may have a dinner appointment at a restaurant in an unfamiliar area. Kazu may indicate or otherwise describe the restaurant to his computing device 10 with a HUD on his car windshield through a microphone. The image recognition program 24 may convert the speech to text to form a query 28. A suitable image recognition model 22′ may be determined and selected as described above. The image recognition program 24 may use Kazu's associated GPS signal and an estimated or determined location of the restaurant to indicate to Kazu that he is nearing his destination.

Using GPS alone may not always end in a successful trip, because imprecise directions may prompt sudden movements by the user. However, Kazu's computing device 10 may take single images or a video of the street he is on and the image recognition program 24 may use those images or frames of the video as target images 12. When the image recognition program 24 locates the target region of the target image 12 containing the restaurant, the search result 18 may be displayed on Kazu's HUD, clearly indicating exactly where the restaurant is located so that Kazu may safely and easily arrive at his destination.

In another example, the query 28 may include a directive to search for target regions in the target image 12 or in other images that are similar to the target region in the target image 12. In such a case, the user may already have or know of one object that she does not wish to find. Rather, she wants recommendations for a similar object. The recommendations may be chosen by the image recognition program 24 performing a simple web search or searching through aggregated feedback from multiple other users, for example. The image recognition program 24 may present the user with multiple options and allow the user to choose one or more recommended objects, or the image recognition program 24 may determine a most desirable object or a few desirable objects. Alternatively, the query 28 may be formed based on one or more images of the object that the user does not wish to find, but that is instead similar to the object that the user does wish to find. If an image recognition model 22′ with a confidence level above the predetermined threshold does not exist on the computing device 10, then one may be downloaded or created as described below in detail with reference to FIGS. 4 and 5.

Turning now to FIG. 2, if the local search option is determined to be unavailable due to a determination by the local image recognition program 24 that an estimated degree of confidence in locally available search models is below a minimum threshold, then the user may be presented with a GUI selector labeled WEB, by which the user may choose a web based image recognition search. Of course, as discussed above, the WEB selector may be presented to the user earlier in the interaction process, to afford the user the option to forgo local searching and proceed directly to a web based search. Furthermore, in some embodiments, the web based search may be programmatically executed upon determining that a local search option is not available with the minimum degree of confidence, without necessitating user input via the WEB selector.

Upon receiving a user selection of the WEB selector, the computing device 10 is configured to send the target image 12 preselected by the user to a web image recognition service 14, which executes the requested search using an associated image database 16. The target image 12 is typically sent over a network connection 11, such as a cellular network connection coupled to a wide area network (WAN) such as the Internet. The server generates search results 18 and sends them back to the computing device 10 via the network 11. The computing device 10 may then receive search results 18 and display the search results 18 on a display 32.

It will be appreciated that by providing both a local search option and a web search option in the above described manner, the challenges discussed in the Background may be mitigated. First, with local search the target image 12 is not sent over a network, saving bandwidth and time, and as a result, possibly saving additional user fees. Second, local search offers the user robust privacy protection for the user's data by keeping the target image stored locally without sending it to a web service. Third, for the operator of the search service, server-side resources are potentially saved if users conduct searches locally rather than on the servers of the search service.

FIG. 3 is a flowchart of a method 300 for operating an image recognition program on a computing device having adaptable image search. The following description of method 300 is provided with reference to the software and hardware components of the computing device 10 described above and shown in FIG. 1. It will be appreciated that method 300 may also be performed in other contexts using other suitable hardware and software components.

With reference to FIG. 3, at 302 the method 300 may include receiving a query from a user. The query may comprise text that is typed, converted from speech captured by a microphone, converted from an image via optical character recognition (OCR), or produced by other techniques. At 304 the method 300 may include receiving a target image. The target image may be an image or video within which a search based on the query is to be performed. For example, the target image may be an image or video of the user's current surroundings and the query may indicate a desired product or destination.

The computing device may comprise a camera, and the target image may be captured by the camera. Alternatively, the target image may be selected from a stored folder of images, etc. The computing device may be a smartphone, tablet, laptop computer, other mobile device, watch, head-mounted display (HMD) device, other wearable computing device, or desktop computer, for example.

At 306 the method 300 may include ranking a plurality of image recognition models by confidence level for performing the search based on the query within the target image. The image recognition models may be stored in non-volatile memory of the computing device. Each image recognition model may include one or more image recognition algorithms, an optical character recognition (OCR) algorithm, and/or a keyword matching algorithm, among others. Image recognition algorithms may include face matching algorithms, two-dimensional (2-D) model matching algorithms, three-dimensional (3-D) model matching algorithms, neural net algorithms, image segmentation algorithms, barcode recognition algorithms, quick response (QR) code detectors and decoders, and/or other algorithms, for example. At 308 the method 300 may include determining whether any of the image recognition models is above a confidence threshold for performing the search locally on a processor of the computing device.

The confidence level of the image recognition models may be influenced by a number of factors, as described above. For instance, the image recognition program may run one or more light weight processes, such as a face detection algorithm, to classify objects in the target image and/or the query. The image recognition models may also include text descriptions that may be compared to the query. Additionally, the image recognition program may show the user the image on which one of the image recognition models was based. In such a manner, the image recognition program may optionally suggest multiple image recognition models with high confidence levels to the user, and the user may select at least one image recognition model for performing the search.

Upon determining that at least one of the image recognition models is above the confidence threshold (yes at 308), at 310 the method 300 may include selecting at least one highly ranked image recognition model. Multiple image recognition models may be ranked above the confidence threshold and some or all of such image recognition models may be selected for use in the same search. Alternatively, only the highest ranked image recognition model may be selected, or the user may choose from among a selection of highly ranked image recognition models. At 312 the method 300 may include performing the search within the target image for a target region of the target image using at least one selected image recognition model. In addition to the image recognition models, the image recognition program may also use location information about the user or other users via secondary signals associated with the image recognition models, such as GPS data and SSIDs.

At 314 the method 300 may include returning a search result to the user, which may include the image recognition program displaying the search result on a display such as a liquid-crystal display (LCD) or a head-up display (HUD), for example. After this step, user feedback may be recorded and used to improve accuracy of image recognition model selection. On the other hand, upon determining that none of the image recognition models is above the confidence threshold (no at 308), the method 300 may proceed to checking a web service as illustrated at step 400, or to creating a new image recognition model as illustrated at step 500.
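The overall control flow of method 300, including the fallbacks to step 400 and step 500, might be sketched as below. This is a minimal sketch under stated assumptions: the Model class, the best_above and adaptable_search functions, the threshold of 0.7, and the modeling of the web service as a plain list of models are all hypothetical, and the sketch tries the web service before creating a new model although the disclosure permits either path.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Model:
    name: str
    confidence: Callable[[str, object], float]  # (query, image) -> confidence
    search: Callable[[str, object], str]        # (query, image) -> search result


def best_above(models, query, image, threshold):
    """Return the highest-confidence model above the threshold, or None."""
    scored = [(m.confidence(query, image), m) for m in models]
    scored = [pair for pair in scored if pair[0] > threshold]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


def adaptable_search(query, image, local_models, web_models, create_model,
                     threshold=0.7):
    """Try local models, then the web service's models, then create a model."""
    model = best_above(local_models, query, image, threshold)    # 306 to 310
    source = "local"
    if model is None:
        model = best_above(web_models, query, image, threshold)  # step 400
        source = "web"
        if model is not None:
            local_models.append(model)  # stands in for downloading at 410
    if model is None:
        model = create_model(query)                              # step 500
        source = "created"
        local_models.append(model)      # stands in for storing at 510
    return source, model.search(query, image)                    # 312 and 314
```

The returned source label is included only so that a caller can see which path was taken; in every path the target image itself stays on the device.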

FIG. 4 is a flowchart illustrating substeps of the step 400 of checking a web service of the method 300 of FIG. 3, for retrieving new image recognition models from the web service. At 402 the step 400 may include connecting to the web service. The web service may include a plurality of image recognition models. At 404 the step 400 may include receiving a ranking of image recognition models of the web service by confidence level for performing a search based on the query within the target image. At 406 the step 400 may include determining whether any of the image recognition models of the web service is above a confidence threshold for performing the search, as described above with respect to method 300.

Upon determining that at least one of the image recognition models of the web service is above the confidence threshold (yes at 406), at 408 the step 400 may include selecting at least one highly ranked image recognition model of the web service. At 410 the step 400 may include downloading at least one selected image recognition model to the computing device. As above in method 300, one or multiple image recognition models may be selected and used in the search. At 412 the step 400 may include performing the search within the target image for a target region of the target image using at least one downloaded image recognition model. At 414 the step 400 may include returning a search result to the user. On the other hand, a “no” decision at 406 may include proceeding to create a new image recognition model as illustrated at step 500.

Alternatively to or in conjunction with step 400, new, popular, promoted, or otherwise selected image recognition models may be delivered to the computing device on a basis other than case-by-case. For example, new or improved image recognition models may be packaged as updates for the image recognition program and the user may be prompted to download the updates at regular intervals, the image recognition program may be configured to download the updates automatically, or the user may be able to choose among model packages based on what appeals to him. Updating the image recognition models may include updating, replacing, or adding to the algorithms included with the image recognition models, for example. Image recognition models may also be downloaded or updated on the device based on secondary signals such as GPS data, SSIDs associated with known geographic locations, ambient noise, etc.

FIG. 5 is a flowchart of a step 500 of method 300 of FIG. 3 for creating a new image recognition model. For instance, in the above example of Jane going shopping, perhaps Jane wishes to buy one new item that she has only heard about from a friend. She may not have a suitable image recognition model for the new item on her HMD device and the web service may not yet have one available for download, either. In this case, Jane may wish to create a new image recognition model.

At 502 the step 500 may include performing a web image search in an image search engine, wherein the web image search is based on the query. At 504 the step 500 may include returning a predetermined number of example images. The predetermined number may be set by the user or the image recognition program. At 506 the step 500 may include selecting at least one example image from the predetermined number of example images. The selection may occur by allowing the user to indicate which of the example images best represents the query, or by the image recognition program choosing intelligently according to programmatic rules and/or user preferences. If the image recognition program chooses, user feedback may be recorded at this step to improve this function.

Programmatic rules may take into account context of previous queries. For example, if the previous 10 queries from the user contained books, and the current query likely indicates either a book or a car, the image recognition program may choose the example image of a book. In another example, programmatic rules may take into account secondary signals. If the GPS data indicates that the user is at a bookstore, then the image recognition program may choose the example image of a book. If the user is at a car dealership, then the image recognition program may choose the example image of a car.
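One way such programmatic rules could be expressed is sketched below. The PLACE_TO_CATEGORY table and the choose_example function are hypothetical, as is the precedence given here to the location signal over the query history; the disclosure does not specify how the two rules are ordered.

```python
from collections import Counter

# Hypothetical table: place type -> object category expected at that place.
PLACE_TO_CATEGORY = {"bookstore": "book", "car dealership": "car"}


def choose_example(candidates, previous_categories, place=None):
    """Pick the category of the example image to use for a new model.

    candidates: category labels of the candidate example images
    previous_categories: categories matched by the user's earlier queries
    place: place type inferred from secondary signals, or None if unknown
    """
    # Rule 1 (assumed to take precedence): the user's current location.
    expected = PLACE_TO_CATEGORY.get(place)
    if expected in candidates:
        return expected
    # Rule 2: the candidate category seen most often in earlier queries.
    history = Counter(c for c in previous_categories if c in candidates)
    if history:
        return history.most_common(1)[0][0]
    # No rule applies, so fall back to the first candidate.
    return candidates[0]
```

For instance, with candidates "book" and "car" and ten earlier book queries, the sketch returns "book" unless the place is a car dealership, in which case it returns "car".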

At 508 the step 500 may include creating an image recognition modelbased on at least one selected example image. The user may optionallyenter a text phrase or caption to describe the created image recognitionmodel. At 510 the step 500 may include storing the created imagerecognition model in the non-volatile memory of the computing device.The created image recognition model may optionally be uploaded to theweb service to be shared with other users. At 512 the step 500 mayinclude performing the search within the target image for a targetregion of the target image using the created image recognition model. At514 the step 500 may include returning a search result to the user.

Next, some example use case scenarios will be described with referenceto illustrations in FIGS. 7-9. These use case scenarios are exemplary innature and are not intended to be used to limit the scope of the claimedsubject matter.

First, FIG. 7 illustrates an example image recognition search for a redcoffee mug by a user, Ariana. Ariana has a favorite red coffee mug shelikes to use every morning. Lately, she has been using a blue one thatshe doesn't like as much. One day, she may decide to use her favoritered coffee mug again and be unable to remember where she left it.

First, Ariana may open the image recognition program installed on hersmartphone. The image recognition program may prompt her to submit aquery, so Ariana may interact with a digital keyboard by using atouchscreen on her smartphone to type “red coffee mug.” Additionally,Ariana may take a picture of the room where she thinks she left the redcoffee mug using a built-in camera on her smartphone. The imagerecognition program may process the picture of Ariana's room as a targetimage, using the typed text as a query.

The image recognition program may select a suitable image recognitionmodel for finding the red coffee mug and perform the search for Ariana.Selecting only a suitable image recognition model may aid the imagerecognition program in locating the red coffee mug and not the blue oneinstead. Once the search completes, the image recognition program maydisplay the picture Ariana took and circle the location of the redcoffee mug. In case Ariana is not paying attention, the smartphone mayvibrate to alert her.

FIG. 8 illustrates an example image recognition search for a book by the same user. This time, Ariana may be at a bookstore and not have access to her bookshelf. She may see that a book, Bartleby, the Scrivener, is on sale at the bookstore and wish to purchase a copy for herself. However, she may not be sure whether or not she already owns a copy of the book. Luckily, Ariana may still have the picture she took of her room saved on her smartphone. She may try zooming in on the books on her bookshelf to read the titles on the spines of the books, but she may be in a hurry and not wish to read all of the titles.

Instead, Ariana may submit the same picture, this time saved on her smartphone, to the image recognition program. She may again use the touchscreen to type a query, this time, “Bartleby, the Scrivener.” In this instance, Ariana may not have a suitable image recognition model already loaded on her smartphone, so the image recognition program may connect to the web service via a wireless network. The image recognition program may select and download an image recognition model with an OCR algorithm that may extract text from the spines of the books in the picture and compare them to the text Ariana submitted. Once the search completes, the image recognition program may display the picture Ariana took and circle the location of only the book she is looking for. As before, the smartphone may vibrate to alert her so that Ariana will not buy another copy of the same book.
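
The comparison between OCR-extracted text and the submitted query might be sketched as follows. This Python example assumes the OCR algorithm has already produced a list of text regions with bounding boxes; the TextRegion type, the normalize and find_matching_regions functions, and the 0.8 similarity cutoff are all hypothetical choices for illustration, not details taken from the disclosure. Fuzzy matching with difflib from the Python standard library is used here only to tolerate punctuation and capitalization differences.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class TextRegion:
    """Hypothetical OCR output: recognized text and its bounding box."""
    text: str
    box: tuple[int, int, int, int]  # (left, top, width, height)


def normalize(s: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in s)
    return " ".join(cleaned.split())


def find_matching_regions(query: str, regions: list[TextRegion],
                          min_ratio: float = 0.8) -> list[TextRegion]:
    """Return the regions whose text is sufficiently similar to the query."""
    q = normalize(query)
    hits = []
    for region in regions:
        ratio = SequenceMatcher(None, q, normalize(region.text)).ratio()
        if ratio >= min_ratio:
            hits.append(region)
    return hits


# Assumed OCR results for three book spines in the picture.
spines = [
    TextRegion("Moby-Dick", (0, 0, 10, 100)),
    TextRegion("BARTLEBY THE SCRIVENER", (12, 0, 10, 100)),
    TextRegion("Walden", (24, 0, 10, 100)),
]
hits = find_matching_regions("Bartleby, the Scrivener", spines)
```

The bounding box of each hit is what the program could circle on the displayed picture.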

FIG. 9 illustrates an example image recognition search for an electronics store in a mall by another user, Ramzi. Ramzi is in a mall in a foreign country and he does not know the local language very well. He may need a replacement cord for an electronic device, so he may wish to visit an electronics store. Ramzi may walk up to a mall directory while wearing his HMD goggles and look at a map of the mall. The image recognition program may take frames from a video captured by the goggles and use at least one of the frames as a target image. An on-board microphone in the goggles may be used to capture Ramzi's verbal query of “electronics store,” which may then be converted to a text query.

A suitable image recognition model may be selected by the image recognition program. The image recognition program may then use an OCR algorithm to extract text from the video frame and then translate the directory. A listing determined to correspond with an electronics store may be highlighted on Ramzi's HMD and an audio alert such as a beep, chirp, or verbal reporting of the result may be produced by the goggles. Additionally, since Ramzi may actually want the location of the electronics store on the map of the directory, not just the listing, the image recognition program may return the search results to Ramzi by circling the location on the map corresponding to the listing of the electronics store on the HMD. The circle around the location may flash, be in a distinctive color, or otherwise be set apart from the surrounding view through Ramzi's HMD. The circle may be the only virtual content on the HMD of Ramzi's goggles, overlaid on his view of the directory through the HMD and following the recognized location as Ramzi moves his head.

The above described systems and methods may be used to perform an image search locally in a bandwidth efficient manner based on context dependent search models, when the system is confident that one of the models will deliver high quality search results. This approach has the potential advantages of bandwidth savings, search accuracy, privacy protection, and distributed processing to ease the burden on centralized servers.
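
One way the confidence determination might be sketched is shown below: locally stored models are ranked by comparing the query with each model's text description, and only models above a confidence threshold are selected. The word-overlap (Jaccard) score, the threshold value of 0.5, and the names tokens, confidence, and select_models are assumptions made for illustration; the disclosure does not prescribe a particular scoring function.

```python
def tokens(text: str) -> set[str]:
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def confidence(query: str, description: str) -> float:
    """Assumed score: Jaccard overlap between query and description words."""
    q, d = tokens(query), tokens(description)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)


def select_models(query: str, descriptions: dict[str, str],
                  threshold: float = 0.5) -> list[str]:
    """Rank models by confidence and keep those above the threshold."""
    ranked = sorted(
        ((confidence(query, desc), name) for name, desc in descriptions.items()),
        reverse=True,
    )
    return [name for score, name in ranked if score > threshold]


# Hypothetical locally stored models, keyed by name, with text descriptions.
local_models = {
    "mugs": "red coffee mug",
    "blue": "blue coffee mug",
    "books": "book spine titles",
}
selected = select_models("red coffee mug", local_models)
fallback_needed = not select_models("electronics store", local_models)
```

An empty selection corresponds to the case in which no local model is confident enough, so that the program would turn to the web service or to example images instead.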

As a variation on the above described embodiments, the query 28 may include a secondary image selected by the user. The secondary image may be captured by the camera 34, or selected from an image collection, such as a photo collection stored locally on the computing device 10. The secondary image may be of an object, which may be an inanimate object such as a book or building, an animate object such as a person or animal, a place or scene, etc. The secondary image may have metadata associated with it, which was previously entered by a user, or programmatically generated by an image capture device or image analysis program. For example, the secondary image may be of a face of a recognized person and the metadata may be the person's name or other identifier, the image may be a recognized book and the metadata may be a title captured by OCR, the image may be of a recognized object and the metadata may indicate the type of object (e.g., “red coffee mug”), or the image may be a recognized place and the metadata may be the GPS coordinates of the place and the name of the place, as but a few examples. When using the secondary image as the query 28, typically the metadata in textual form is used as a text query for the image search. According to this embodiment, in the flowchart of FIG. 3, step 302 is accomplished by receiving a secondary image having associated pre-existing or programmatically generated metadata, as described above, and the search query is based on the metadata. Following step 302, the remaining steps proceed as described above.
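
Deriving a text query from the metadata of a secondary image might look like the following minimal sketch. The SecondaryImage type, the metadata key names ("name", "title", "object_type", "place_name"), and their priority order are hypothetical; the disclosure states only that metadata in textual form is typically used as the text query.

```python
from dataclasses import dataclass, field


@dataclass
class SecondaryImage:
    """Hypothetical secondary image: a file path plus textual metadata."""
    path: str
    metadata: dict[str, str] = field(default_factory=dict)


def query_from_secondary_image(
    image: SecondaryImage,
    keys: tuple[str, ...] = ("name", "title", "object_type", "place_name"),
) -> str:
    """Return the first non-empty textual metadata value as the text query."""
    for key in keys:
        value = image.metadata.get(key, "").strip()
        if value:
            return value
    raise ValueError("secondary image has no usable textual metadata")


mug_photo = SecondaryImage("mug.jpg", {"object_type": "red coffee mug"})
text_query = query_from_secondary_image(mug_photo)
```

The resulting text query would then be handled in the same way as a typed or spoken query.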

In some embodiments, the methods and processes described herein may be tied to a computing system of one or more computing devices. In particular, such methods and processes may be implemented as a computer-application program or service, an application-programming interface (API), a library, and/or other computer-program product. Further, in some embodiments, all or a portion of the methods and processes may be implemented in hardware, such as in an application specific integrated circuit (ASIC), field programmable gate array (FPGA), etc.

FIG. 6 schematically shows a non-limiting embodiment of a computing system 600 that can enact one or more of the methods and processes described above. Computing device 10 connected to server 602 via network 604 may take the form of computing system 600. Computing system 600 is shown in simplified form. In different embodiments, computing system 600 may take the form of one or more personal computers, server computers, tablet computers, home-entertainment computers, network computing devices, gaming devices, mobile computing devices, mobile communication devices (e.g., smartphone), and/or other computing devices.

Computing system 600 includes a logic subsystem 606 and a storage subsystem 608. Computing system 600 may optionally include a display subsystem 610, input subsystem 612, communication subsystem 614, and/or other components not shown in FIG. 6. Server 602 may have an additional communication subsystem 616, logic subsystem 618, and storage system 620 and be configured to host a web service 622 as described above.

Logic subsystem 606 includes one or more physical devices configured to execute instructions. For example, the logic subsystem may be configured to execute instructions that are part of one or more applications, services, programs, routines, libraries, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more components, achieve a technical effect, or otherwise arrive at a desired result.

The logic subsystem may include one or more processors configured to execute software instructions. Additionally or alternatively, the logic subsystem may include one or more hardware or firmware logic subsystems configured to execute hardware or firmware instructions. Processors of the logic subsystem may be single-core or multi-core, and the instructions executed thereon may be configured for sequential, parallel, and/or distributed processing. Individual components of the logic subsystem optionally may be distributed among two or more separate devices, which may be remotely located and/or configured for coordinated processing. Aspects of the logic machine may be virtualized and executed by remotely accessible, networked computing devices configured in a cloud-computing configuration.

Storage subsystem 608 includes one or more physical devices configured to hold instructions executable by the logic subsystem to implement the methods and processes described herein. When such methods and processes are implemented, the state of storage subsystem 608 may be transformed, e.g., to hold different data.

Storage subsystem 608 may include removable and/or built-in devices. Storage subsystem 608 may include optical memory (e.g., CD, DVD, HD-DVD, Blu-Ray Disc, etc.), semiconductor memory (e.g., RAM, EPROM, EEPROM, etc.), and/or magnetic memory (e.g., hard-disk drive, floppy-disk drive, tape drive, MRAM, etc.), among others. Storage subsystem 608 may include volatile, non-volatile, dynamic, static, read/write, read-only, random-access, sequential-access, location-addressable, file-addressable, and/or content-addressable devices.

It will be appreciated that storage subsystem 608 includes one or more physical devices. However, aspects of the instructions described herein alternatively may be propagated by a communication medium (e.g., an electromagnetic signal, an optical signal, etc.) that is not held by a physical device for a finite duration.

Aspects of logic subsystem 606 and storage subsystem 608 may be integrated together into one or more hardware-logic components. Such hardware-logic components may include field-programmable gate arrays (FPGAs), program- and application-specific integrated circuits (PASIC/ASICs), program- and application-specific standard products (PSSP/ASSPs), system-on-a-chip (SOC), and complex programmable logic devices (CPLDs), for example.

The terms “module,” “program,” “subsystem,” and “engine” may be used to describe an aspect of computing system 600 implemented to perform a particular function. In some cases, a module, program, subsystem, or engine may be instantiated via logic subsystem 606 executing instructions held by storage subsystem 608. It will be understood that different modules, programs, subsystems, and/or engines may be instantiated from the same application, service, code block, object, library, routine, API, function, etc. Likewise, the same module, program, subsystem, and/or engine may be instantiated by different applications, services, code blocks, objects, routines, APIs, functions, etc. The terms “module,” “program,” “subsystem,” and “engine” may encompass individual or groups of executable files, data files, libraries, drivers, scripts, database records, etc.

It will be appreciated that a “service”, as used herein, is an application program executable across multiple user sessions. A service may be available to one or more system components, programs, and/or other services. In some implementations, a service may run on one or more server-computing devices.

When included, display subsystem 610 may be used to present a visual representation of data held by storage subsystem 608. This visual representation may take the form of a graphical user interface (GUI). As the herein described methods and processes change the data held by the storage subsystem, and thus transform the state of the storage machine, the state of display subsystem 610 may likewise be transformed to visually represent changes in the underlying data. Display subsystem 610 may include one or more display devices utilizing virtually any type of technology. Such display devices may be combined with logic subsystem 606 and/or storage subsystem 608 in a shared enclosure, or such display devices may be peripheral display devices.

When included, input subsystem 612 may comprise or interface with one or more user-input devices such as a keyboard, mouse, touch screen, or game controller. In some embodiments, the input subsystem may comprise or interface with selected natural user input (NUI) componentry. Such componentry may be integrated or peripheral, and the transduction and/or processing of input actions may be handled on- or off-board. Example NUI componentry may include a microphone for speech and/or voice recognition; an infrared, color, stereoscopic, and/or depth camera for machine vision and/or gesture recognition; a head tracker, eye tracker, accelerometer, and/or gyroscope for motion detection and/or intent recognition; as well as electric-field sensing componentry for assessing brain activity.

When included, communication subsystem 614 may be configured to communicatively couple computing system 600 with one or more other computing devices. Communication subsystem 614 may include wired and/or wireless communication devices compatible with one or more different communication protocols. As non-limiting examples, the communication subsystem may be configured for communication via a wireless telephone network, or a wired or wireless local- or wide-area network. In some embodiments, the communication subsystem may allow computing system 600 to send and/or receive messages to and/or from other devices via a network such as the Internet.

It will be understood that the configurations and/or approaches described herein are exemplary in nature, and that these specific embodiments or examples are not to be considered in a limiting sense, because numerous variations are possible. The specific routines or methods described herein may represent one or more of any number of processing strategies. As such, various acts illustrated and/or described may be performed in the sequence illustrated and/or described, in other sequences, in parallel, or omitted. Likewise, the order of the above-described processes may be changed.

The subject matter of the present disclosure includes all novel and nonobvious combinations and subcombinations of the various processes, systems and configurations, and other features, functions, acts, and/or properties disclosed herein, as well as any and all equivalents thereof.

The invention claimed is:
 1. A computing device having adaptable image search, the computing device comprising: non-volatile memory configured to store a plurality of image recognition models; an image recognition program executed by a processor of the computing device, the computing device being a user computing device, and the image recognition program configured to: receive a query from a user, the query comprising text that is typed or converted from speech; receive a target image within which a search based on the query is to be performed; rank the image recognition models by confidence level for performing the search based on at least a comparison between the query and respective text descriptions of the image recognition models; determine whether the confidence level of any of the image recognition models is above a confidence threshold; and upon determining that at least one confidence level of the image recognition models is above the confidence threshold, select at least one of the image recognition models whose confidence level is above the confidence threshold; perform the search within the target image for a target region of the target image using at least one selected image recognition model locally on the processor; and return a search result to the user.
 2. The computing device of claim 1, wherein the target image is a single image or one or a plurality of image frames that constitute a portion of video.
 3. The computing device of claim 1, wherein each image recognition model includes at least one of the following: an image recognition algorithm, an optical character recognition (OCR) algorithm, and a keyword matching algorithm.
 4. The computing device of claim 1, wherein the image recognition program uses location information about the user.
 5. The computing device of claim 1, wherein the target image is an image or video of the user's current surroundings and the query indicates a desired product or destination.
 6. The computing device of claim 1, wherein the computing device comprises a camera and wherein the target image is captured by the camera.
 7. The computing device of claim 1, wherein the computing device is a smartphone or tablet.
 8. The computing device of claim 1, wherein the computing device is a watch or other wearable device.
 9. The computing device of claim 1, wherein the image recognition program displays the search result on a head-up display.
 10. The computing device of claim 1, wherein the query includes a directive to search for target regions in the target image or in other images that are similar to the target region in the target image.
 11. A method for operating an image recognition program on a computing device having adaptable image search, the method comprising: executing the image recognition program on a processor of the computing device, the computing device being a user computing device; receiving a query from a user, the query comprising text that is typed or converted from speech; receiving a target image within which a search based on the query is to be performed; ranking a plurality of image recognition models by confidence level for performing the search based on at least a comparison between the query and respective text descriptions of the image recognition models, wherein the image recognition models are stored in non-volatile memory of the computing device; determining whether the confidence level of any of the image recognition models is above a confidence threshold; and upon determining that at least one confidence level of the image recognition models is above the confidence threshold, selecting at least one of the image recognition models whose confidence level is above the confidence threshold; performing the search within the target image for a target region of the target image using at least one selected image recognition model locally on the processor; and returning a search result to the user.
 12. The method of claim 11, wherein each image recognition model includes at least one of the following: an image recognition algorithm, an optical character recognition (OCR) algorithm, and a keyword matching algorithm.
 13. The method of claim 11, wherein the image recognition program uses location information about the user.
 14. The method of claim 11, wherein the target image is an image or video of the user's current surroundings and the query indicates a desired product or destination.
 15. The method of claim 11, wherein the computing device comprises a camera and wherein the target image is captured by the camera.
 16. The method of claim 11, wherein the computing device is a smartphone or tablet.
 17. The method of claim 11, wherein the image recognition program displays the search result on a head-up display.
 18. The method of claim 11, further comprising, upon determining that none of the confidence levels of the image recognition models is above the confidence threshold: connecting to a web service including a plurality of image recognition models; receiving a ranking of image recognition models of the web service by confidence level for performing the search based on the query within the target image; determining whether any of the image recognition models of the web service is above the confidence threshold for performing the search; and upon determining that at least one confidence level of the image recognition models of the web service is above the confidence threshold, selecting at least one of the image recognition models of the web service whose confidence level is above the confidence threshold; downloading at least one selected image recognition model to the computing device; performing the search within the target image for a target region of the target image using at least one downloaded image recognition model; and returning the search result to the user.
 19. The method of claim 11, further comprising, upon determining that none of the confidence levels of the image recognition models is above the confidence threshold: performing a web image search in an image search engine, wherein the web image search is based on the query; returning a predetermined number of example images; selecting at least one example image from the predetermined number of example images; creating an image recognition model based on at least one selected example image; storing the created image recognition model in the non-volatile memory of the computing device; performing the search within the target image for a target region of the target image using the created image recognition model; and returning the search result to the user.
 20. A computing device having adaptable image search, the computing device comprising: non-volatile memory configured to store a plurality of image recognition models; a camera, wherein a target image is captured by the camera and wherein the target image is a single image or one or a plurality of image frames that constitute a portion of video; a head-up display; and an image recognition program executed by a processor of the computing device, the computing device being a user computing device, and the image recognition program configured to: receive a query from a user, the query comprising text that is typed or converted from speech; receive the target image within which a search based on the query is to be performed; rank the image recognition models by confidence level for performing the search based on at least a comparison between the query and respective text descriptions of the image recognition models; determine whether the confidence level of any of the image recognition models is above a confidence threshold; and upon determining that the confidence level of at least one of the image recognition models is above the confidence threshold, select at least one of the image recognition models whose confidence level is above the confidence threshold; perform the search within the target image for a target region of the target image using at least one selected image recognition model locally on the processor; return a search result to the user; and display the search result on the head-up display.